Here I briefly discuss loading and visualizing raw TIFF data. Typically I focus on data in tabular form post segmentation (not the raw images), but it’s helpful to see what they look like.
The following image is a single ROI from a tissue slice from a patient with non small cell lung carcinoma.Cell and tissue segmentation and cell phenotyping were performed using inForm software. For each image, there are multiple Tiff files output by inForm:
lung_component_data.tif: the multichannel file we are
most interested inlung_binary_seg_maps.tif: 3 channel file with binary
segmentation mapslung.tif: A composite image in RGB format.
# load relevant libraries
library(tidyverse)
library(tiff)
library(EBImage)
# define your specific file path
my_path = "./Data/"
# define file for different .tif files
img = paste0(my_path, "lung_component_data.tif")
segmentation = paste0(my_path, "lung_binary_seg_maps.tif")
composite_tif = paste0(my_path, "lung.tif")
The component.tif file contains the multichannel tiff,
which is a stack of individual images (one for each marker). Below I
define a function to read the image stack file and extract metadata.
# this function is from phenoptr package on GitHub
# extracts metadata specific to Vectra3/VectraPolaris images
read_image_stacks <- function(path) {
stopifnot(file.exists(path), endsWith(path, 'component_data.tif'))
tif = tiff::readTIFF(path, all=TRUE, info=TRUE)
# Get the image descriptions and figure out which ones are components
infos = purrr::map_chr(tif, ~attr(., 'description'))
images = grepl('FullResolution', infos)
# Get the image_stack names
names = stringr::str_match(infos[images], '<Name>(.*)</Name>')[, 2]
purrr::set_names(tif[images], names)
}
image_stack = read_image_stacks(img)
The image_stack object is now a list, where each list
element is a different marker
map(image_stack, dim)
## $`CD19 (Opal 650)`
## [1] 1008 1348
##
## $`CD3 (Opal 520)`
## [1] 1008 1348
##
## $`CD14 (Opal 540)`
## [1] 1008 1348
##
## $`CD8 (Opal 620)`
## [1] 1008 1348
##
## $`HLADR (Opal 690)`
## [1] 1008 1348
##
## $`CK (Opal 570)`
## [1] 1008 1348
##
## $`DAPI (DAPI)`
## [1] 1008 1348
##
## $Autofluorescence
## [1] 1008 1348
I can use the display() function from the
EBImage package to plot an individual channel. Below the
DAPI (nucleus) channel is plotted.
dapi_channel = t(image_stack$"DAPI (DAPI)")
EBImage::display(dapi_channel, method = "raster")
Now we plot the CD3 channel (a protein marker for T-cells).
tcell_channel = t(image_stack$"CD3 (Opal 520)")
EBImage::display(tcell_channel, method = "raster")
This data was collected on a Vectra 3 instrument and segmented using inForm tissue analysis software from Akoya Biosciences. This is proprietary image analysis software that can perform cell and tissue segmentation and cell phenotyping. The segmentation below were performed by that software.
# source phenoptr code for loading file and metadata, based on tiff::readTIFF.
source("https://raw.githubusercontent.com/akoyabio/phenoptr/main/R/read_maps.R")
segmentations = read_maps(segmentation)
names(segmentations)
## [1] "Nucleus" "Membrane" "Tissue"
Nucleus segmentation
nucleus = segmentations[['Membrane']]
plot(as.raster(nucleus, max = max(nucleus)))
Nucleus segmentation
nucleus = segmentations[['Nucleus']]
plot(as.raster(nucleus, max = max(nucleus)))
# source phenoptr code for loading file and metadata, based on tiff::readTIFF.
source("https://raw.githubusercontent.com/akoyabio/phenoptr/main/R/read_composites.R")
composite = read_composites(composite_tif)
composite = as.raster(composite[[1]])
plot(composite)
rm(dapi_channel, image_stack, nucleus,segmentation, tissue, composite, dapi)
## Warning in rm(dapi_channel, image_stack, nucleus, segmentation, tissue, :
## object 'tissue' not found
## Warning in rm(dapi_channel, image_stack, nucleus, segmentation, tissue, :
## object 'dapi' not found
The VectraPolarisData ExperimentHub package provides two large multiplex immunofluorescence datasets collected using Akoya Biosciences Vectra 3 and Vectra Polaris platforms. Image preprocessing (cell segmentation and phenotyping) was performed using inForm software. Data are provided as objects of class SpatialExperiment.
# load libraries
library(tidyverse)
The data is publicly available on Bioconductor as the package
VectraPolarisData. You can install the package from
Bioconductor here:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
BiocManager::install("VectraPolarisData")
Load the high grade serous ovarian cancer data. This dataset has been segmented and phenotyped, and has one image per patient. I will be working with a form of this data that has already been preprocessed into a tidy dataframe.
# load processed ovarian cancer data
load(url("https://github.com/julia-wrobel/MI_tutorial/raw/main/Data/ovarian.RDA"))
The resulting dataframe, ovarian_df, contains
information on marker intensities, cell X and Y position, cell
phenotypes, and patient-level characteristics.
ovarian_df %>%
select(contains("id"),tissue_category, cd8, phenotype_cd68, x, y, everything()) %>%
as_tibble() %>%
head()
## # A tibble: 6 × 35
## sample_id cell_id slide_id tissue_category cd8 phenotype_cd68 x y
## <dbl> <dbl> <chr> <chr> <dbl> <chr> <dbl> <dbl>
## 1 68 1 030120 P… Stroma 0.251 "" 12185. 34524.
## 2 68 2 030120 P… Tumor 0.276 "" 12233 34524.
## 3 68 3 030120 P… Stroma 0.279 "" 12254. 34524.
## 4 68 4 030120 P… Stroma 0.48 "CD68-" 12031. 34525.
## 5 68 5 030120 P… Stroma 0.422 "CD68-" 11848. 34531.
## 6 68 6 030120 P… Stroma 0.279 "CD68-" 11875. 34535.
## # ℹ 27 more variables: primary <dbl>, recurrent <dbl>, treatment_effect <dbl>,
## # stage <chr>, survival_time <dbl>, death <dbl>, BRCA_mutation <dbl>,
## # age_at_diagnosis <dbl>, time_to_recurrence <chr>, parpi_inhibitor <chr>,
## # debulking <chr>, stage_bin <dbl>, phenotype_ki67 <chr>, phenotype_ck <chr>,
## # phenotype_cd19 <chr>, phenotype_p_stat3 <chr>, phenotype_cd3 <chr>,
## # phenotype_cd8 <chr>, ck <dbl>, ki67 <dbl>, ier3 <dbl>, pstat3 <dbl>,
## # cd3 <dbl>, cd68 <dbl>, cd19 <dbl>, dapi <dbl>, immune <chr>
Below we plot the cells for a single subject. In this dataset there is one image for each subject, and the image comes from a tumor microarray (TMA).
set.seed(7332)
id = sample(ovarian_df$sample_id, 1)
ovarian_df %>%
filter(sample_id == id) %>%
ggplot(aes(x, y)) +
geom_point(aes(color = tissue_category), size = 0.2)
ovarian_df %>%
filter(sample_id == id) %>%
mutate(phenotype_cd68 = ifelse(phenotype_cd68 == "CD68+", "CD68+", "CD68-")) %>%
ggplot(aes(x, y)) +
geom_point(aes(color = phenotype_cd68), alpha = 0.8, size = 0.7)
Load the non small cell lung carcinoma data and inspect the
SpatialExperiment object. This dataset has 3-4 images per
patient, representing different regions of interest (ROIs) from a tissue
slice. In this dataset, each patient was imaged on a separate slide.
# load processed ovarian cancer data
load(url("https://github.com/julia-wrobel/MI_tutorial/raw/main/Data/lung.RDA"))
set.seed(23423)
id = sample(lung_df$patient_id, 1)
lung_df %>%
filter(patient_id == id) %>%
ggplot(aes(x, y)) +
geom_point(aes(color = tissue_category), size = 0.3) +
facet_wrap(~image_id)
The spatstat package in R has great
resources for analyzing spatial point patterns. Let’s use the image
above to do some exploratory analysis with spatstat.
We will do spatial analysis on the distribution of B-cells and
macrophages, so we subset to only these cell types, and create a
phenotype variable that designates whether a cell is a B
cell or a macrophage. Then we create a ppp object, which is
what the spatstat package uses for storage and analysis of
point pattern data.
subj_df = filter(ovarian_df, sample_id == 7)
library(spatstat)
subj_df = subj_df %>%
# create a phenotype variable that is B-cell, macrophage, or other
mutate(phenotype = case_when(
phenotype_cd68 == "CD68+" ~ "macrophage",
phenotype_cd19 == "CD19+" ~ "B-cell",
TRUE ~ "other"
)) %>%
filter(phenotype != "other") %>%
select(x,y, immune, tissue_category, phenotype)
# first define window of observation for your image
w = convexhull.xy(subj_df$x,subj_df$y)
# create ppp object as multitype point pattern
# need to factor the marks variable for ppp object to be treated as a multitype point pattern
ovarian_pp = ppp(subj_df$x,subj_df$y, window = w,
marks = factor(subj_df$phenotype))
First we calculate Ripley’s K for B cells.
k_bcell = Kest(subset(ovarian_pp, marks == "B-cell"), correction = "isotropic")
We can plot this k_bcell object using
spatstat to visualize the estimated K (\(\hat{K}^{iso}\)) compared to the K under
CSR (\(K^{pois}\)).
plot(k_bcell)
Now visualize the K-function for macrophages in this image:
k_mac = Kest(subset(ovarian_pp, marks == "macrophage"),
correction = "isotropic")
plot(k_mac)
It appears that there is some degree of clustering for both B-cells and macrophages in this image. Are these statistically significant?
We can test this using the envelope function, which
performs Monte Carlo simulations of the K function under CSR. Plot below
shows pointwise envelope for B-cell K-function. Pointwise means that
these are performed separately for each value of r. Can only be
interpreted if a specific value of r is chosen in advance.
e_test = envelope(subset(ovarian_pp, marks == "B-cell"),
Kest, correction = "isotropic", global = TRUE)
## Generating 99 simulations of CSR ...
## 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
## 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
## 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
## 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
## 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98,
## 99.
##
## Done.
plot(e_test)
Points outside the envelope indicate significant departure from CSR.
Note that this could be due to inhomogeneity in the distribution of
cells, or due to clustering. See spatstat::Kinhom function
for a test that takes into account inhomogeneity. See
spatstat::Lest function to compute the L-function.
The bivariate K function measures whether the spatial distribution of 2 cell types is independent. Deviation from the \(K^{pois}\) line suggests a spatial relationship between the B-cells and macrophages in the image.
k_biv = Kcross(ovarian_pp, "B-cell", "macrophage", correction = "isotropic")
plot(k_biv)
Below are original citations for the two example datasets I will be working with in this tutorial.